A framework for simulating genotype-by-environment interaction using multiplicative models

Key message: The simulation of genotype-by-environment interaction using multiplicative models provides a general and scalable framework to generate realistic multi-environment datasets and model plant breeding programmes.

Abstract: Plant breeding has been historically shaped by genotype-by-environment interaction (GEI). Despite its importance, however, many current simulations do not adequately capture the complexity of GEI inherent to plant breeding. The framework developed in this paper simulates GEI with desirable structure using multiplicative models. The framework can be used to simulate a hypothetical target population of environments (TPE), from which many different multi-environment trial (MET) datasets can be sampled. Measures of variance explained and expected accuracy are developed to tune the simulation of non-crossover and crossover GEI and quantify the MET-TPE alignment. The framework has been implemented within the R package FieldSimR, and is demonstrated here using two working examples supported by R code. The first example embeds the framework into a linear mixed model to generate MET datasets with low, moderate and high GEI, which are used to compare several popular statistical models applied to plant breeding. The prediction accuracy generally increases as the level of GEI decreases or the number of environments sampled in the MET increases. The second example integrates the framework into a breeding programme simulation to compare genomic and phenotypic selection strategies over time. Genomic selection outperforms phenotypic selection by ∼50–70% in the TPE, depending on the level of GEI. These examples demonstrate how the new framework can be used to generate realistic MET datasets and model plant breeding programmes that better reflect the complexity of real-world settings, making it a valuable tool for optimising a wide range of breeding methodologies.

Supplementary Information: The online version contains supplementary material available at 10.1007/s00122-024-04644-7.

Sup. Fig. 1: Parameters responsible for the structure of the between-environment genetic correlation matrix, C_e. The parameters (, , , ) were altered, while keeping the others constant ( = 0,  = 1 − ,  = 0,  = 7). Different between-environment genetic variance matrices, G_e, were then obtained by multiplying each C_e with D_e in Fig. 2 of the manuscript using Eq. 5, from which measures of variance explained were calculated as labelled above each histogram. The vertical blue line represents the mean genetic correlation between environments (also labelled).
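The scaling step described in this caption can be illustrated with a minimal Python sketch. It assumes that Eq. 5 has the usual form G_e = D_e^(1/2) C_e D_e^(1/2), and that the proportion of main effect variance is the variance of the across-environment mean (the mean of all elements of G_e) divided by the mean of its diagonal. Both are assumptions made for illustration, since the equations themselves are not reproduced here. The function names and the three-environment matrices are hypothetical and are not part of FieldSimR.

```python
import math

def build_ge(cor, var):
    """Scale a correlation matrix by environment variances: D^(1/2) C D^(1/2)."""
    sd = [math.sqrt(v) for v in var]
    p = len(sd)
    return [[sd[i] * cor[i][j] * sd[j] for j in range(p)] for i in range(p)]

def variance_proportions(ge):
    """Return (main effect proportion, interaction proportion) for G_e.

    Assumed definitions: the total is the mean genetic variance across
    environments, and the main effect part is the variance of the
    across-environment mean, i.e. the mean of all elements of G_e.
    """
    p = len(ge)
    total = sum(ge[j][j] for j in range(p)) / p
    main = sum(map(sum, ge)) / p ** 2
    return main / total, 1.0 - main / total

def mean_correlation(cor):
    """Mean of the off-diagonal genetic correlations between environments."""
    p = len(cor)
    off_diag = [cor[i][j] for i in range(p) for j in range(p) if i != j]
    return sum(off_diag) / len(off_diag)

# Hypothetical three-environment example (values invented for illustration).
cor = [[1.0, 0.6, 0.2],
       [0.6, 1.0, 0.4],
       [0.2, 0.4, 1.0]]
var = [1.0, 2.0, 1.5]

ge = build_ge(cor, var)
v_main, v_int = variance_proportions(ge)
print(f"mean correlation = {mean_correlation(cor):.2f}")
print(f"main effect proportion = {v_main:.2f}, interaction proportion = {v_int:.2f}")
```

Under these assumed definitions, a matrix with equal variances and a common correlation rho across p environments gives a main effect proportion of (1 + (p - 1) rho) / p, which is a convenient check on the functions.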
The proportion of main effect variance is given by Eq. 7 while the expected main effect accuracy in the TPE is given by Eq. 12, which is equal to the expected main effect accuracy in the MET (Eq. 13) multiplied by the expected MET-TPE alignment (Eq. 14). The measures of accuracy are presented in the following figures for 1000 simulated MET datasets with low, moderate and high GEI. Note that the MET-TPE alignments in these figures are built using Eq. 13, and are therefore different to those presented in Fig. 5 of the manuscript which are built using the empirical correlation between the true genotype main effects in the TPE and those sampled in each MET dataset.
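The empirical MET-TPE alignment mentioned above, i.e. the correlation between the true genotype main effects in the TPE and those in a sampled MET, can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the FieldSimR implementation: the three-term multiplicative model, the loading distributions, the numbers of genotypes and environments, and the use of 100 rather than 1000 MET samples are all hypothetical choices made to keep the example small.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simulate_tpe(n_genos, n_envs, rng):
    """Simulate GE effects with a three-term multiplicative model.

    Each effect is sum_k loading[env][k] * score[geno][k]. The first term has
    all-positive-leaning loadings, the other two have zero-mean loadings.
    These distributions are assumptions for illustration only.
    """
    loadings = [[rng.gauss(1.0, 0.3), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
                for _ in range(n_envs)]
    scores = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n_genos)]
    return [[sum(l * s for l, s in zip(load, score)) for load in loadings]
            for score in scores]

def main_effects(effects, envs):
    """Genotype main effects as the mean GE effect over the given environments."""
    return [sum(row[j] for j in envs) / len(envs) for row in effects]

def mean_alignment(effects, n_met_envs, n_samples, rng):
    """Average empirical MET-TPE alignment over repeated MET samples."""
    n_envs = len(effects[0])
    tpe_main = main_effects(effects, range(n_envs))
    total = 0.0
    for _ in range(n_samples):
        met_envs = rng.sample(range(n_envs), n_met_envs)
        total += pearson(tpe_main, main_effects(effects, met_envs))
    return total / n_samples

rng = random.Random(1)  # fixed seed so the sketch is reproducible
tpe = simulate_tpe(n_genos=200, n_envs=500, rng=rng)
alignment = {p: mean_alignment(tpe, p, 100, rng) for p in (5, 10, 20, 50)}
for p, r in alignment.items():
    print(f"{p:>2} environments: mean MET-TPE alignment = {r:.3f}")
```

Sampling more environments makes the MET main effects a better proxy for the TPE main effects, so the average alignment should rise with the number of environments sampled.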

Low GEI
Sup. Fig. 3a: True simulation parameters for 1000 MET datasets with 5, 10, 20 or 50 environments sampled from a TPE with low GEI. The top panel presents the proportion of main effect and interaction variances, the proportion of non-crossover and crossover variances and the plot-level heritability (H²). The middle panel presents the mean genetic and error variances. The bottom panel presents the main effect accuracies in the TPE and MET, the MET-TPE alignment and the accuracy of the GE effects in the MET per Eqs. 12-16 of the manuscript. Note: The crosses represent the expected values per Table 2.

Moderate GEI
Sup. Fig. 3b: True simulation parameters for 1000 MET datasets with 5, 10, 20 or 50 environments sampled from a TPE with moderate GEI. The top panel presents the proportion of main effect and interaction variances, the proportion of non-crossover and crossover variances and the plot-level heritability (H²). The middle panel presents the mean genetic and error variances. The bottom panel presents the main effect accuracies in the TPE and MET, the MET-TPE alignment and the accuracy of the GE effects in the MET per Eqs. 12-16 of the manuscript. Note: The crosses represent the expected values per Table 2.

High GEI
True simulation parameters for 1000 MET datasets with 5, 10, 20 or 50 environments sampled from a TPE with high GEI. The top panel presents the proportion of main effect and interaction variances, the proportion of non-crossover and crossover variances and the plot-level heritability (H²). The middle panel presents the mean genetic and error variances. The bottom panel presents the main effect accuracies in the TPE and MET, the MET-TPE alignment and the accuracy of the GE effects in the MET per Eqs. 12-16 of the manuscript. Note: The crosses represent the expected values per Table 2.
Expected main effect accuracy in the TPE and MET, and the expected MET-TPE alignment for different proportions of genotype main effect variance and different numbers of environments sampled in the MET dataset. Highlighted are the expected trajectories for 5, 10, 20 and 50 environments. The overall plot-level heritability is H² = 0.3 (i.e. a mean genetic variance of 1.47 and an error variance of 3.44), with two replicates per environment.
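The following Python sketch shows how the three quantities in this caption relate. It uses the caption's values (mean genetic variance 1.47, error variance 3.44, two replicates) and makes three simplifying assumptions: a compound symmetric GEI structure, phenotypic means as the predictor, and a TPE large enough that interaction effects average out. These are textbook expressions chosen for illustration. The manuscript's Eqs. 12 to 14 are not reproduced here and may differ. The function name and the main effect proportion of 0.5 are hypothetical.

```python
import math

MEAN_GENETIC_VAR = 1.47  # mean genetic variance, from the caption
ERROR_VAR = 3.44         # error variance, from the caption
N_REPS = 2               # replicates per environment, from the caption

def expected_accuracies(v_main, n_envs, gen_var=MEAN_GENETIC_VAR,
                        err_var=ERROR_VAR, n_reps=N_REPS):
    """Return (accuracy in TPE, accuracy in MET, MET-TPE alignment).

    Assumes compound symmetry and phenotypic means as predictions, so that
    the accuracy in the TPE is the product of the other two quantities.
    """
    main_var = v_main * gen_var                 # genotype main effect variance
    int_var = (1.0 - v_main) * gen_var          # interaction variance
    met_var = main_var + int_var / n_envs       # variance of true MET main effects
    pheno_var = met_var + err_var / (n_envs * n_reps)  # variance of phenotypic means
    r_met = math.sqrt(met_var / pheno_var)      # accuracy for the MET main effects
    alignment = math.sqrt(main_var / met_var)   # correlation of MET and TPE main effects
    return r_met * alignment, r_met, alignment

# Plot-level heritability implied by the two variances (approximately 0.3).
plot_h2 = MEAN_GENETIC_VAR / (MEAN_GENETIC_VAR + ERROR_VAR)
print(f"plot-level heritability = {plot_h2:.3f}")

# Hypothetical main effect proportion of 0.5, for the four highlighted MET sizes.
for p in (5, 10, 20, 50):
    r_tpe, r_met, align = expected_accuracies(0.5, p)
    print(f"{p:>2} environments: TPE = {r_tpe:.3f}, MET = {r_met:.3f}, "
          f"alignment = {align:.3f}")
```

Under these assumptions the product simplifies to the square root of the main effect variance divided by the variance of the phenotypic means, so all three quantities rise as more environments are sampled.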